Skip to content

OCPBUGS-66898: Implement mTLS authentication and authorization for CVO metrics endpoint#1271

Merged
openshift-merge-bot[bot] merged 12 commits intoopenshift:mainfrom
DavidHurta:metrics-mtls
Feb 24, 2026
Merged

OCPBUGS-66898: Implement mTLS authentication and authorization for CVO metrics endpoint#1271
openshift-merge-bot[bot] merged 12 commits intoopenshift:mainfrom
DavidHurta:metrics-mtls

Conversation

@DavidHurta
Copy link
Contributor

@DavidHurta DavidHurta commented Dec 10, 2025

This pull request proposes to migrate the CVO metrics endpoint from bearer token authentication to mutual TLS (mTLS) to comply with OpenShift metrics conventions.

Key changes:

  • Replaced custom certificate watching logic with the k8s.io/apiserver/pkg/server/dynamiccertificates package to simplify logic
  • Authenticate clients using mutual TLS (mTLS)
  • Authorize clients by verifying CN
  • Add configurable authentication/authorization options for HyperShift compatibility
  • Update ServiceMonitor to use mTLS instead of bearer tokens

Breaking Changes:

  • Prometheus must now authenticate using client certificates signed by a trusted cluster CA. Clients without valid certificates are rejected during the TLS handshake.

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 10, 2025
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Dec 10, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@coderabbitai
Copy link

coderabbitai bot commented Dec 10, 2025

Important

Review skipped

Auto reviews are limited based on label configuration.

🚫 Review skipped — only excluded labels are configured. (1)
  • do-not-merge/work-in-progress

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Switched metrics auth from bearer tokens to TLS client certificates with CN-based authorization; introduced dynamic serving-cert and client-CA controllers; added public MetricsOptions and updated RunMetrics/createHttpServer signatures; updated ServiceMonitor, go.mod, tests, startup flags, and call sites.

Changes

Cohort / File(s) Summary
Go module updates
go.mod
Reorganized dependencies: added direct k8s.io/apiserver v0.34.1, removed direct gopkg.in/fsnotify.v1 from primary require block, and added github.com/fsnotify/fsnotify v1.9.0 as an indirect dependency.
ServiceMonitor manifest
install/0000_90_cluster-version-operator_02_servicemonitor.yaml
Removed bearerTokenFile from the ServiceMonitor endpoint and configured TLS client-cert scraping via tlsConfig.certFile / tlsConfig.keyFile.
Metrics implementation
pkg/cvo/metrics.go
Replaced token-review flow with TLS client-cert auth and CN-based authorization; introduced public MetricsOptions; changed RunMetrics and createHttpServer signatures; added dynamic serving-cert and client-CA controllers and GetConfigForClient TLS wiring; removed tokenReview interface and file-watcher rotation.
Metrics tests
pkg/cvo/metrics_test.go
Rewrote tests to use generated TLS client certificates and a mock CA provider; expanded authn/authz scenarios (no security, authn-only, authz-only, both, untrusted/missing certs); removed token-based test scaffolding.
Startup, flags, and integration tests
pkg/start/start.go, cmd/cluster-version-operator/start.go, pkg/start/start_integration_test.go
Replaced top-level ListenAddr/ServingCertFile/ServingKeyFile with nested MetricsOptions on exported Options; updated flag bindings, validation, call sites, and integration test assignments to use MetricsOptions.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Comment @coderabbitai help to get the list of available commands and usage tips.

@DavidHurta
Copy link
Contributor Author

/test ?

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 10, 2025
@DavidHurta
Copy link
Contributor Author

/test e2e-agnostic-ovn
/test e2e-hypershift
/test e2e-hypershift-conformance

1 similar comment
@DavidHurta
Copy link
Contributor Author

/test e2e-agnostic-ovn
/test e2e-hypershift
/test e2e-hypershift-conformance

@DavidHurta
Copy link
Contributor Author

/test all

@DavidHurta
Copy link
Contributor Author

/test all

@DavidHurta
Copy link
Contributor Author

DavidHurta commented Dec 13, 2025

Note: PromeCIeus and the e2e-hypershift job can be used to verify that the CVO metrics in a hosted control plane can be scraped (when HyperShift is specified to use the OCP monitoring).

@DavidHurta
Copy link
Contributor Author

@coderabbitai review

@coderabbitai
Copy link

coderabbitai bot commented Dec 13, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pkg/cvo/metrics.go (1)

201-299: Harden the listener TLS config: ensure MinVersion/ciphers apply even if GetConfigForClient returns nil.

The base TLS config is secured via crypto.SecureTLSConfig, but the listener uses a fresh tls.Config literal containing only GetConfigForClient. If the callback returns nil or fails, the listener operates with default TLS settings lacking explicit cipher suite and minimum version constraints. Wrap the listener config with crypto.SecureTLSConfig to guarantee hardened settings at the listener level.

-	tlsConfig := &tls.Config{GetConfigForClient: servingCertController.GetConfigForClient}
+	tlsConfig := crypto.SecureTLSConfig(&tls.Config{GetConfigForClient: servingCertController.GetConfigForClient})
 	go startListening(server, tlsConfig, listenAddress, resultChannel)

Also verify the process has RBAC to read kube-system/extension-apiserver-authentication and that client-ca-file is the intended trust root for Prometheus' client cert chain in all target environments.

🧹 Nitpick comments (3)
pkg/cvo/metrics_test.go (1)

4-6: CN-based TLS auth tests are aligned with the new implementation; consider using httptest.NewRequest for a fully-formed request.
Current http.NewRequest("GET", "url-not-important", nil) is probably fine for this handler, but httptest.NewRequest("GET", "https://example.invalid/metrics", nil) is more idiomatic and avoids edge cases if the handler later reads URL/Host.

Also applies to: 1030-1088

pkg/cvo/metrics.go (2)

126-143: Rename disableAuth to disableAuthorization to avoid confusion (authn vs authz).
Right now createHttpServer(disableAuth bool) is called with metricsOptions.DisableAuthorization, but the parameter name reads like “disable authentication”.

-func createHttpServer(disableAuth bool) *http.Server {
-	if disableAuth {
+func createHttpServer(disableAuthorization bool) *http.Server {
+	if disableAuthorization {
 		handler := http.NewServeMux()
 		handler.Handle("/metrics", promhttp.Handler())
 		server := &http.Server{
 			Handler: handler,
 		}
 		return server
 	}

149-166: CN authorization is very strict; confirm the allowed identity set is complete for all supported deployments.
Hardcoding only system:serviceaccount:openshift-monitoring:prometheus-k8s may be correct, but it will break scraping if the client cert CN differs in hosted control planes / alternate Prometheus setups. Consider allowing a small allowlist (still static) to reduce fragility.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between d4cb3b0 and 920ad71.

⛔ Files ignored due to path filters (69)
  • go.sum is excluded by !**/*.sum
  • vendor/github.com/fsnotify/fsnotify/.cirrus.yml is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/.gitignore is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/.mailmap is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/CHANGELOG.md is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/CONTRIBUTING.md is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/LICENSE is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/README.md is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/backend_fen.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/backend_inotify.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/backend_kqueue.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/backend_other.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/backend_windows.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/fsnotify.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/internal/darwin.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/internal/debug_darwin.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/internal/debug_dragonfly.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/internal/debug_freebsd.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/internal/debug_kqueue.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/internal/debug_linux.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/internal/debug_netbsd.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/internal/debug_openbsd.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/internal/debug_solaris.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/internal/debug_windows.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/internal/freebsd.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/internal/internal.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/internal/unix.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/internal/unix2.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/internal/windows.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/shared.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/staticcheck.conf is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/system_bsd.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/fsnotify/fsnotify/system_darwin.go is excluded by !vendor/**, !**/vendor/**
  • vendor/gopkg.in/fsnotify.v1/.editorconfig is excluded by !vendor/**, !**/vendor/**
  • vendor/gopkg.in/fsnotify.v1/.gitignore is excluded by !vendor/**, !**/vendor/**
  • vendor/gopkg.in/fsnotify.v1/.travis.yml is excluded by !vendor/**, !**/vendor/**
  • vendor/gopkg.in/fsnotify.v1/AUTHORS is excluded by !vendor/**, !**/vendor/**
  • vendor/gopkg.in/fsnotify.v1/CHANGELOG.md is excluded by !vendor/**, !**/vendor/**
  • vendor/gopkg.in/fsnotify.v1/CONTRIBUTING.md is excluded by !vendor/**, !**/vendor/**
  • vendor/gopkg.in/fsnotify.v1/LICENSE is excluded by !vendor/**, !**/vendor/**
  • vendor/gopkg.in/fsnotify.v1/README.md is excluded by !vendor/**, !**/vendor/**
  • vendor/gopkg.in/fsnotify.v1/fen.go is excluded by !vendor/**, !**/vendor/**
  • vendor/gopkg.in/fsnotify.v1/fsnotify.go is excluded by !vendor/**, !**/vendor/**
  • vendor/gopkg.in/fsnotify.v1/inotify.go is excluded by !vendor/**, !**/vendor/**
  • vendor/gopkg.in/fsnotify.v1/inotify_poller.go is excluded by !vendor/**, !**/vendor/**
  • vendor/gopkg.in/fsnotify.v1/kqueue.go is excluded by !vendor/**, !**/vendor/**
  • vendor/gopkg.in/fsnotify.v1/open_mode_bsd.go is excluded by !vendor/**, !**/vendor/**
  • vendor/gopkg.in/fsnotify.v1/open_mode_darwin.go is excluded by !vendor/**, !**/vendor/**
  • vendor/gopkg.in/fsnotify.v1/windows.go is excluded by !vendor/**, !**/vendor/**
  • vendor/k8s.io/apiserver/pkg/server/dynamiccertificates/cert_key.go is excluded by !vendor/**, !**/vendor/**
  • vendor/k8s.io/apiserver/pkg/server/dynamiccertificates/client_ca.go is excluded by !vendor/**, !**/vendor/**
  • vendor/k8s.io/apiserver/pkg/server/dynamiccertificates/configmap_cafile_content.go is excluded by !vendor/**, !**/vendor/**
  • vendor/k8s.io/apiserver/pkg/server/dynamiccertificates/dynamic_cafile_content.go is excluded by !vendor/**, !**/vendor/**
  • vendor/k8s.io/apiserver/pkg/server/dynamiccertificates/dynamic_serving_content.go is excluded by !vendor/**, !**/vendor/**
  • vendor/k8s.io/apiserver/pkg/server/dynamiccertificates/dynamic_sni_content.go is excluded by !vendor/**, !**/vendor/**
  • vendor/k8s.io/apiserver/pkg/server/dynamiccertificates/interfaces.go is excluded by !vendor/**, !**/vendor/**
  • vendor/k8s.io/apiserver/pkg/server/dynamiccertificates/named_certificates.go is excluded by !vendor/**, !**/vendor/**
  • vendor/k8s.io/apiserver/pkg/server/dynamiccertificates/static_content.go is excluded by !vendor/**, !**/vendor/**
  • vendor/k8s.io/apiserver/pkg/server/dynamiccertificates/tlsconfig.go is excluded by !vendor/**, !**/vendor/**
  • vendor/k8s.io/apiserver/pkg/server/dynamiccertificates/union_content.go is excluded by !vendor/**, !**/vendor/**
  • vendor/k8s.io/apiserver/pkg/server/dynamiccertificates/util.go is excluded by !vendor/**, !**/vendor/**
  • vendor/k8s.io/client-go/tools/events/OWNERS is excluded by !vendor/**, !**/vendor/**
  • vendor/k8s.io/client-go/tools/events/doc.go is excluded by !vendor/**, !**/vendor/**
  • vendor/k8s.io/client-go/tools/events/event_broadcaster.go is excluded by !vendor/**, !**/vendor/**
  • vendor/k8s.io/client-go/tools/events/event_recorder.go is excluded by !vendor/**, !**/vendor/**
  • vendor/k8s.io/client-go/tools/events/fake.go is excluded by !vendor/**, !**/vendor/**
  • vendor/k8s.io/client-go/tools/events/helper.go is excluded by !vendor/**, !**/vendor/**
  • vendor/k8s.io/client-go/tools/events/interfaces.go is excluded by !vendor/**, !**/vendor/**
  • vendor/modules.txt is excluded by !vendor/**, !**/vendor/**
📒 Files selected for processing (5)
  • go.mod (2 hunks)
  • install/0000_90_cluster-version-operator_02_servicemonitor.yaml (1 hunks)
  • pkg/cvo/metrics.go (5 hunks)
  • pkg/cvo/metrics_test.go (2 hunks)
  • pkg/start/start.go (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.

Files:

  • install/0000_90_cluster-version-operator_02_servicemonitor.yaml
  • pkg/cvo/metrics_test.go
  • pkg/start/start.go
  • pkg/cvo/metrics.go
  • go.mod
🧬 Code graph analysis (1)
pkg/start/start.go (1)
pkg/cvo/metrics.go (2)
  • MetricsOptions (196-199)
  • RunMetrics (208-299)
🪛 ast-grep (0.40.0)
pkg/cvo/metrics.go

[warning] 245-245: MinVersionis missing from this TLS configuration. By default, TLS 1.2 is currently used as the minimum when acting as a client, and TLS 1.0 when acting as a server. General purpose web applications should default to TLS 1.3 with all other protocols disabled. Only where it is known that a web server must support legacy clients with unsupported an insecure browsers (such as Internet Explorer 10), it may be necessary to enable TLS 1.0 to provide support. AddMinVersion: tls.VersionTLS13' to the TLS configuration to bump the minimum version to TLS 1.3.
Context: tls.Config{}
Note: [CWE-327]: Use of a Broken or Risky Cryptographic Algorithm [OWASP A03:2017]: Sensitive Data Exposure [OWASP A02:2021]: Cryptographic Failures [REFERENCES]
https://owasp.org/Top10/A02_2021-Cryptographic_Failures

(missing-ssl-minversion-go)


[warning] 247-247: MinVersionis missing from this TLS configuration. By default, TLS 1.2 is currently used as the minimum when acting as a client, and TLS 1.0 when acting as a server. General purpose web applications should default to TLS 1.3 with all other protocols disabled. Only where it is known that a web server must support legacy clients with unsupported an insecure browsers (such as Internet Explorer 10), it may be necessary to enable TLS 1.0 to provide support. AddMinVersion: tls.VersionTLS13' to the TLS configuration to bump the minimum version to TLS 1.3.
Context: tls.Config{ClientAuth: tls.RequireAndVerifyClientCert}
Note: [CWE-327]: Use of a Broken or Risky Cryptographic Algorithm [OWASP A03:2017]: Sensitive Data Exposure [OWASP A02:2021]: Cryptographic Failures [REFERENCES]
https://owasp.org/Top10/A02_2021-Cryptographic_Failures

(missing-ssl-minversion-go)


[warning] 273-273: MinVersionis missing from this TLS configuration. By default, TLS 1.2 is currently used as the minimum when acting as a client, and TLS 1.0 when acting as a server. General purpose web applications should default to TLS 1.3 with all other protocols disabled. Only where it is known that a web server must support legacy clients with unsupported an insecure browsers (such as Internet Explorer 10), it may be necessary to enable TLS 1.0 to provide support. AddMinVersion: tls.VersionTLS13' to the TLS configuration to bump the minimum version to TLS 1.3.
Context: tls.Config{GetConfigForClient: servingCertController.GetConfigForClient}
Note: [CWE-327]: Use of a Broken or Risky Cryptographic Algorithm [OWASP A03:2017]: Sensitive Data Exposure [OWASP A02:2021]: Cryptographic Failures [REFERENCES]
https://owasp.org/Top10/A02_2021-Cryptographic_Failures

(missing-ssl-minversion-go)

🔇 Additional comments (5)
install/0000_90_cluster-version-operator_02_servicemonitor.yaml (1)

14-21: Verify Prometheus mounts metrics-client-certs at the referenced file paths (otherwise scraping will fail).
ServiceMonitor now expects client cert/key at /etc/prometheus/secrets/metrics-client-certs/{tls.crt,tls.key}; ensure the cluster-monitoring-managed Prometheus pod has that secret mounted and populated in all target topologies (self-managed + hosted).

go.mod (1)

29-29: Dependency changes look consistent with dynamiccertificates adoption; please verify module graph is tidy and conflict-free.
Run a go mod tidy-equivalent in CI and ensure no remaining imports of gopkg.in/fsnotify.v1, and no k8s.io/* version skew introduced by the new direct k8s.io/apiserver requirement.

Also applies to: 42-42

pkg/start/start.go (1)

353-358: Double-check HyperShift semantics: this disables both client-cert auth and CN authorization.
Wiring to o.HyperShift is clean, but please confirm the hosted-control-plane scraping model really expects unauthenticated metrics over TLS (vs. mTLS with different client identity).

pkg/cvo/metrics.go (2)

18-20: Good move switching to dynamiccertificates; verify k8s.io/apiserver integration matches the vendored K8s minor version.
This import-level change is appropriate for hot-reloading certs/CA; just ensure no k8s.io/* skew and that dynamiccertificates APIs used here match v0.34.1 expectations.


196-199: MetricsOptions is a clean API shape for callers.

@DavidHurta
Copy link
Contributor Author

/test e2e-hypershift
/test e2e-agnostic-ovn

@DavidHurta
Copy link
Contributor Author

/test all

@DavidHurta
Copy link
Contributor Author

@coderabbitai review

@coderabbitai
Copy link

coderabbitai bot commented Dec 19, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
pkg/cvo/metrics.go (1)

149-166: Certificate-based authorization correctly implemented.

The CN validation logic is sound: it checks for the presence of peer certificates and validates that the Common Name matches the expected Prometheus service account. The HTTP status codes (401 for missing certificate, 403 for unauthorized CN) are appropriate.

Note: The CN "system:serviceaccount:openshift-monitoring:prometheus-k8s" is hardcoded. Consider whether this should be configurable via MetricsOptions if other clients might need access in the future.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 920ad71 and ff6d96e.

📒 Files selected for processing (4)
  • install/0000_90_cluster-version-operator_02_servicemonitor.yaml (1 hunks)
  • pkg/cvo/metrics.go (5 hunks)
  • pkg/cvo/metrics_test.go (2 hunks)
  • pkg/start/start.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkg/cvo/metrics_test.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.

Files:

  • install/0000_90_cluster-version-operator_02_servicemonitor.yaml
  • pkg/start/start.go
  • pkg/cvo/metrics.go
🧬 Code graph analysis (1)
pkg/start/start.go (1)
pkg/cvo/metrics.go (2)
  • MetricsOptions (196-199)
  • RunMetrics (208-312)
🪛 ast-grep (0.40.0)
pkg/cvo/metrics.go

[warning] 256-256: MinVersionis missing from this TLS configuration. By default, TLS 1.2 is currently used as the minimum when acting as a client, and TLS 1.0 when acting as a server. General purpose web applications should default to TLS 1.3 with all other protocols disabled. Only where it is known that a web server must support legacy clients with unsupported an insecure browsers (such as Internet Explorer 10), it may be necessary to enable TLS 1.0 to provide support. AddMinVersion: tls.VersionTLS13' to the TLS configuration to bump the minimum version to TLS 1.3.
Context: tls.Config{ClientAuth: clientAuth}
Note: [CWE-327]: Use of a Broken or Risky Cryptographic Algorithm [OWASP A03:2017]: Sensitive Data Exposure [OWASP A02:2021]: Cryptographic Failures [REFERENCES]
https://owasp.org/Top10/A02_2021-Cryptographic_Failures

(missing-ssl-minversion-go)


[warning] 283-286: MinVersionis missing from this TLS configuration. By default, TLS 1.2 is currently used as the minimum when acting as a client, and TLS 1.0 when acting as a server. General purpose web applications should default to TLS 1.3 with all other protocols disabled. Only where it is known that a web server must support legacy clients with unsupported an insecure browsers (such as Internet Explorer 10), it may be necessary to enable TLS 1.0 to provide support. AddMinVersion: tls.VersionTLS13' to the TLS configuration to bump the minimum version to TLS 1.3.
Context: tls.Config{
ClientAuth: clientAuth,
GetConfigForClient: servingCertController.GetConfigForClient,
}
Note: [CWE-327]: Use of a Broken or Risky Cryptographic Algorithm [OWASP A03:2017]: Sensitive Data Exposure [OWASP A02:2021]: Cryptographic Failures [REFERENCES]
https://owasp.org/Top10/A02_2021-Cryptographic_Failures

(missing-ssl-minversion-go)

🔇 Additional comments (10)
install/0000_90_cluster-version-operator_02_servicemonitor.yaml (1)

14-21: LGTM! TLS client certificate authentication correctly configured.

The ServiceMonitor now uses TLS client certificates (certFile and keyFile) for authentication instead of bearer tokens, aligning with the mTLS implementation in pkg/cvo/metrics.go that validates the client certificate's CN.

pkg/cvo/metrics.go (8)

18-19: LGTM! Imports correctly updated for certificate-based approach.

The new imports for dynamiccertificates and kubernetes client replace the previous token-based authentication dependencies and support the dynamic certificate loading implementation.


126-143: LGTM! Simplified server creation aligns with TLS-based authentication.

The function signature correctly removes the authentication client dependency since authentication is now handled at the TLS layer via client certificates, with authorization delegated to the authHandler.


196-199: LGTM! Clean options struct for metrics configuration.

The MetricsOptions struct provides a clear, type-safe way to configure authentication and authorization behavior for the metrics server.


208-223: LGTM! Serving certificate controller correctly initialized.

The serving certificate controller is properly set up to watch and automatically reload the server's TLS certificate and key files. The RunOnce initialization before starting the background watcher ensures the files are loaded before the server starts.


225-249: LGTM! Client CA controller correctly set up when authentication is enabled.

The conditional creation and initialization of the client CA controller is appropriate—it's only needed when client certificate authentication is enabled. The controller watches the extension-apiserver-authentication ConfigMap in kube-system for CA changes, which aligns with Kubernetes' standard approach for extension API server authentication.


276-288: LGTM! Server and TLS configuration properly wired.

The HTTP server creation respects the DisableAuthorization setting, and the TLS configuration correctly uses GetConfigForClient to enable dynamic certificate updates. The ClientAuth setting in the final config ensures proper fallback behavior if the callback fails.


290-311: LGTM! Server lifecycle properly managed with graceful shutdown.

The implementation correctly handles the run/shutdown contexts, waits for server termination, performs graceful shutdown on context cancellation, and properly collects results from the server goroutine. Error handling and logging are appropriate.


252-267: This concern is not valid. crypto.SecureTLSConfig from github.com/openshift/library-go/pkg/crypto automatically sets MinVersion to the cluster's default TLS version (via DefaultTLSVersion()) when MinVersion is not explicitly set in the passed tls.Config. The code in metrics.go correctly relies on this behavior and does not need explicit MinVersion configuration.

Likely an incorrect or invalid review comment.

pkg/start/start.go (1)

353-357: LGTM! MetricsOptions correctly configured for HyperShift mode.

The creation of the MetricsOptions struct with both DisableAuthentication and DisableAuthorization set to o.HyperShift is appropriate. This ensures that in HyperShift deployments, both authentication and authorization are disabled together, while in standard deployments, both are enabled.

@DavidHurta
Copy link
Contributor Author

/retest

1 similar comment
@DavidHurta
Copy link
Contributor Author

/retest

@DavidHurta
Copy link
Contributor Author

/test e2e-hypershift-conformance
/test e2e-hypershift

@DavidHurta
Copy link
Contributor Author

/test all

@DavidHurta
Copy link
Contributor Author

/test all

@DavidHurta
Copy link
Contributor Author

/test all

@DavidHurta
Copy link
Contributor Author

@coderabbitai review

DavidHurta and others added 2 commits February 18, 2026 16:47
This is done to provide HTTP return values in failures to comply
with the origin test suite.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Feb 18, 2026
@hongkailiu
Copy link
Member

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Feb 18, 2026
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Feb 18, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: DavidHurta, hongkailiu

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:
  • OWNERS [DavidHurta,hongkailiu]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@dis016
Copy link

dis016 commented Feb 20, 2026

Test Scenario's:
Prometheus can scrape using mTLS

dinesh@Dineshs-MacBook-Pro ~ % oc get clusterversion 
NAME      VERSION                                                AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.22.0-0-2026-02-20-041325-test-ci-ln-z63ctmb-latest   True        False         59m     Cluster version is 4.22.0-0-2026-02-20-041325-test-ci-ln-z63ctmb-latest
dinesh@Dineshs-MacBook-Pro ~ % 

Prometheus should not scrape with bearer token.

 
dinesh@Dineshs-MacBook-Pro ~ % token=$(oc create token prometheus-k8s  -n openshift-monitoring)
dinesh@Dineshs-MacBook-Pro ~ % oc -n openshift-monitoring exec -c prometheus pod/prometheus-k8s-0 -- curl -i -H "Authorization:Bearer $token"  https://cluster-version-operator.openshift-cluster-version.svc.cluster.local:9099/metrics
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (60) SSL certificate problem: self-signed certificate in certificate chain
More details here: https://curl.se/docs/sslcerts.html

curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.
command terminated with exit code 60
dinesh@Dineshs-MacBook-Pro ~ % 

Client key is required with CA bundle


dinesh@Dineshs-MacBook-Pro ~ % oc -n openshift-monitoring exec -c prometheus pod/prometheus-k8s-0 -- curl -i --cacert /etc/prometheus/configmaps/serving-certs-ca-bundle/service-ca.crt --cert /etc/prometheus/secrets/metrics-client-certs/tls.crt https://cluster-version-operator.openshift-cluster-version.svc.cluster.local:9099/metrics
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (58) unable to set private key file: '/etc/prometheus/secrets/metrics-client-certs/tls.crt' type PEM
command terminated with exit code 58
dinesh@Dineshs-MacBook-Pro ~ % 

@dis016
Copy link

dis016 commented Feb 20, 2026

Test Scenario: Verify CVO endpoint cluster-version-operator.openshift-cluster-version.svc_9099 is working well with new mTLS authentication.

https://github.com/openshift/openshift-tests-private/pull/28753#issuecomment-3931879266

@dis016
Copy link

dis016 commented Feb 20, 2026

/verified by @dis016

@openshift-ci-robot openshift-ci-robot added the verified Signifies that the PR passed pre-merge verification criteria label Feb 20, 2026
@openshift-ci-robot
Copy link
Contributor

@dis016: This PR has been marked as verified by @dis016.

Details

In response to this:

/verified by @dis016

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@DavidHurta
Copy link
Contributor Author

/label acknowledge-critical-fixes-only

@openshift-ci openshift-ci bot added the acknowledge-critical-fixes-only Indicates if the issuer of the label is OK with the policy. label Feb 23, 2026
@DavidHurta
Copy link
Contributor Author

/retest

1 similar comment
@DavidHurta
Copy link
Contributor Author

/retest

@DavidHurta
Copy link
Contributor Author

/override e2e-agnostic-ovn-techpreview-serial

The failed ci/prow/e2e-agnostic-ovn-techpreview-serial runs look unrelated to the PR. Also, the PR code changes have already passed the job in a run for previous changes (5685780), which are identical to the current ones.

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Feb 24, 2026

@DavidHurta: /override requires failed status contexts, check run or a prowjob name to operate on.
The following unknown contexts/checkruns were given:

  • e2e-agnostic-ovn-techpreview-serial

Only the following failed contexts/checkruns were expected:

  • CodeRabbit
  • ci/prow/e2e-agnostic-operator
  • ci/prow/e2e-agnostic-ovn
  • ci/prow/e2e-agnostic-ovn-techpreview-serial
  • ci/prow/e2e-agnostic-ovn-upgrade-into-change
  • ci/prow/e2e-agnostic-ovn-upgrade-out-of-change
  • ci/prow/e2e-aws-ovn-techpreview
  • ci/prow/e2e-hypershift
  • ci/prow/e2e-hypershift-conformance
  • ci/prow/gofmt
  • ci/prow/images
  • ci/prow/lint
  • ci/prow/okd-scos-images
  • ci/prow/unit
  • ci/prow/verify-deps
  • ci/prow/verify-update
  • ci/prow/verify-yaml
  • pull-ci-openshift-cluster-version-operator-main-e2e-agnostic-operator
  • pull-ci-openshift-cluster-version-operator-main-e2e-agnostic-ovn
  • pull-ci-openshift-cluster-version-operator-main-e2e-agnostic-ovn-techpreview-serial
  • pull-ci-openshift-cluster-version-operator-main-e2e-agnostic-ovn-upgrade-into-change
  • pull-ci-openshift-cluster-version-operator-main-e2e-agnostic-ovn-upgrade-out-of-change
  • pull-ci-openshift-cluster-version-operator-main-e2e-aws-ovn-techpreview
  • pull-ci-openshift-cluster-version-operator-main-e2e-hypershift
  • pull-ci-openshift-cluster-version-operator-main-e2e-hypershift-conformance
  • pull-ci-openshift-cluster-version-operator-main-gofmt
  • pull-ci-openshift-cluster-version-operator-main-images
  • pull-ci-openshift-cluster-version-operator-main-lint
  • pull-ci-openshift-cluster-version-operator-main-okd-scos-images
  • pull-ci-openshift-cluster-version-operator-main-unit
  • pull-ci-openshift-cluster-version-operator-main-verify-deps
  • pull-ci-openshift-cluster-version-operator-main-verify-update
  • pull-ci-openshift-cluster-version-operator-main-verify-yaml
  • tide

If you are trying to override a checkrun that has a space in it, you must put a double quote on the context.

Details

In response to this:

/override e2e-agnostic-ovn-techpreview-serial

The failed ci/prow/e2e-agnostic-ovn-techpreview-serial runs look unrelated to the PR. Also, the PR code changes have already passed the job in a run for previous changes (5685780), which are identical to the current ones.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@DavidHurta
Copy link
Contributor Author

/override ci/prow/e2e-agnostic-ovn-techpreview-serial

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Feb 24, 2026

@DavidHurta: Overrode contexts on behalf of DavidHurta: ci/prow/e2e-agnostic-ovn-techpreview-serial

Details

In response to this:

/override ci/prow/e2e-agnostic-ovn-techpreview-serial

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@DavidHurta
Copy link
Contributor Author

/cherrypick release-4.21

@openshift-cherrypick-robot

@DavidHurta: once the present PR merges, I will cherry-pick it on top of release-4.21 in a new PR and assign it to you.

Details

In response to this:

/cherrypick release-4.21

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@DavidHurta
Copy link
Contributor Author

/override ci/prow/e2e-hypershift-conformance

The failed ci/prow/e2e-hypershift-conformance run looks unrelated to the PR. Also, the PR code changes have already passed the job previously in 2024149482949578752.

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Feb 24, 2026

@DavidHurta: Overrode contexts on behalf of DavidHurta: ci/prow/e2e-hypershift-conformance

Details

In response to this:

/override ci/prow/e2e-hypershift-conformance

The failed ci/prow/e2e-hypershift-conformance run looks unrelated to the PR. Also, the PR code changes have already passed the job previously in 2024149482949578752.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Feb 24, 2026

@DavidHurta: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-merge-bot openshift-merge-bot bot merged commit 9a401d5 into openshift:main Feb 24, 2026
18 checks passed
@openshift-ci-robot
Copy link
Contributor

@DavidHurta: Jira Issue Verification Checks: Jira Issue OCPBUGS-66898
✔️ This pull request was pre-merge verified.
✔️ All associated pull requests have merged.
✔️ All associated, merged pull requests were pre-merge verified.

Jira Issue OCPBUGS-66898 has been moved to the MODIFIED state and will move to the VERIFIED state when the change is available in an accepted nightly payload. 🕓

Details

In response to this:

This pull request proposes to migrate the CVO metrics endpoint from bearer token authentication to mutual TLS (mTLS) to comply with OpenShift metrics conventions.

Key changes:

  • Replaced custom certificate watching logic with the k8s.io/apiserver/pkg/server/dynamiccertificates package to simplify logic
  • Authenticate clients using mutual TLS (mTLS)
  • Authorize clients by verifying CN
  • Add configurable authentication/authorization options for HyperShift compatibility
  • Update ServiceMonitor to use mTLS instead of bearer tokens

Breaking Changes:

  • Prometheus must now authenticate using client certificates signed by a trusted cluster CA. Clients without valid certificates are rejected during the TLS handshake.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-cherrypick-robot

@DavidHurta: #1271 failed to apply on top of branch "release-4.21":

Applying: Run `go mod tidy & go mod vendor`
Using index info to reconstruct a base tree...
M	go.mod
M	go.sum
M	vendor/modules.txt
Falling back to patching base and 3-way merge...
Auto-merging vendor/modules.txt
Removing vendor/gopkg.in/fsnotify.v1/windows.go
Removing vendor/gopkg.in/fsnotify.v1/open_mode_darwin.go
Removing vendor/gopkg.in/fsnotify.v1/open_mode_bsd.go
Removing vendor/gopkg.in/fsnotify.v1/kqueue.go
Removing vendor/gopkg.in/fsnotify.v1/inotify_poller.go
Removing vendor/gopkg.in/fsnotify.v1/inotify.go
Removing vendor/gopkg.in/fsnotify.v1/fsnotify.go
Removing vendor/gopkg.in/fsnotify.v1/fen.go
Removing vendor/gopkg.in/fsnotify.v1/README.md
Removing vendor/gopkg.in/fsnotify.v1/LICENSE
Removing vendor/gopkg.in/fsnotify.v1/CONTRIBUTING.md
Removing vendor/gopkg.in/fsnotify.v1/CHANGELOG.md
Removing vendor/gopkg.in/fsnotify.v1/AUTHORS
Removing vendor/gopkg.in/fsnotify.v1/.travis.yml
Removing vendor/gopkg.in/fsnotify.v1/.gitignore
Removing vendor/gopkg.in/fsnotify.v1/.editorconfig
Auto-merging go.sum
Auto-merging go.mod
Applying: pkg/cvo/metrics: Utilize dynamiccertificates package for certificate updates
Using index info to reconstruct a base tree...
M	pkg/cvo/metrics.go
Falling back to patching base and 3-way merge...
Auto-merging pkg/cvo/metrics.go
CONFLICT (content): Merge conflict in pkg/cvo/metrics.go
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
hint: When you have resolved this problem, run "git am --continue".
hint: If you prefer to skip this patch, run "git am --skip" instead.
hint: To restore the original branch and stop patching, run "git am --abort".
hint: Disable this message with "git config set advice.mergeConflict false"
Patch failed at 0002 pkg/cvo/metrics: Utilize dynamiccertificates package for certificate updates

Details

In response to this:

/cherrypick release-4.21

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@openshift-merge-robot
Copy link
Contributor

Fix included in accepted release 4.22.0-0.nightly-2026-02-26-092444

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

acknowledge-critical-fixes-only Indicates if the issuer of the label is OK with the policy. approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/severity-moderate Referenced Jira bug's severity is moderate for the branch this PR is targeting. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. verified Signifies that the PR passed pre-merge verification criteria

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants